AITopics | image parsing

Submodular Field Grammars: Representation, Inference, and Application to Image Parsing

Neural Information Processing Systems, Nov-20-2025, 22:57:53 GMT

Natural scenes contain many layers of part-subpart structure, and distributions over them are thus naturally represented by stochastic image grammars, with one production per decomposition of a part. Unfortunately, in contrast to language grammars, where the number of possible split points for a production $A \rightarrow BC$ is linear in the length of $A$, in an image there are an exponential number of ways to split a region into subregions. This makes parsing intractable and requires image grammars to be severely restricted in practice, for example by allowing only rectangular regions. In this paper, we address this problem by associating with each production a submodular Markov random field whose labels are the subparts and whose labeling segments the current object into these subparts. We call the result a submodular field grammar (SFG). Finding the MAP split of a region into subregions is now tractable, and by exploiting this we develop an efficient approximate algorithm for MAP parsing of images with SFGs. Empirically, we present promising improvements in accuracy when using SFGs for scene understanding, and show exponential improvements in inference time compared to traditional methods, while returning comparable minima.
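The tractability claim can be illustrated for the simplest case, a production with two subparts. The sketch below is not the authors' algorithm: it assumes a binary MRF with non-negative unary costs and Potts-style pairwise penalties (a standard submodular form), and finds the MAP split with a plain Edmonds-Karp max-flow. The function names and the four-pixel example are hypothetical.

```python
from collections import deque
from itertools import product


def energy(unary, edges, x):
    """Unary cost of each pixel's label plus a penalty w for every edge whose ends disagree."""
    pairwise = sum(w for (i, j), w in edges.items() if x[i] != x[j])
    return sum(unary[i][x[i]] for i in range(len(x))) + pairwise


def min_cut_labels(unary, edges):
    """MAP labeling of a binary submodular MRF via s-t min cut.

    unary[i] = (cost of label 0, cost of label 1), both assumed >= 0.
    edges = {(i, j): w} with w >= 0.
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]

    def add(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # residual edge

    for i, (c0, c1) in enumerate(unary):
        add(s, i, c1)  # cut when pixel i ends up on the sink side (label 1)
        add(i, t, c0)  # cut when pixel i stays on the source side (label 0)
    for (i, j), w in edges.items():
        add(i, j, w)
        add(j, i, w)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Pixels still reachable from the source take label 0, the rest label 1.
    labels = [0 if i in parent else 1 for i in range(n)]
    return flow, labels


def brute_force_min(unary, edges):
    """Exhaustive minimum over all 2^n labelings, for checking small cases only."""
    return min(energy(unary, edges, x) for x in product((0, 1), repeat=len(unary)))


# Hypothetical 4-pixel chain: left pixels prefer subpart 0, right pixels subpart 1.
unary = [(0, 5), (1, 4), (4, 1), (5, 0)]
edges = {(0, 1): 2, (1, 2): 2, (2, 3): 2}
flow, labels = min_cut_labels(unary, edges)
```

On this toy chain the cut value equals the energy of the returned labeling and matches the exhaustive minimum, while the max-flow computation runs in time polynomial in the number of pixels rather than over all 2^n labelings.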

application, representation, submodular field grammar

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.90)

Submodular Field Grammars: Representation, Inference, and Application to Image Parsing

Neural Information Processing Systems, Oct-8-2024, 19:09:22 GMT

Natural scenes contain many layers of part-subpart structure, and distributions over them are thus naturally represented by stochastic image grammars, with one production per decomposition of a part. Unfortunately, in contrast to language grammars, where the number of possible split points for a production $A \rightarrow BC$ is linear in the length of $A$, in an image there are an exponential number of ways to split a region into subregions. This makes parsing intractable and requires image grammars to be severely restricted in practice, for example by allowing only rectangular regions. In this paper, we address this problem by associating with each production a submodular Markov random field whose labels are the subparts and whose labeling segments the current object into these subparts. We call the result a submodular field grammar (SFG).
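The linear-versus-exponential contrast can be checked by direct counting. The sketch below is only an illustration with hypothetical function names, and it assumes that any assignment of pixels to the two subparts counts as a split as long as both subparts are non-empty (no connectivity or shape constraint).

```python
from itertools import product


def string_splits(n):
    """Split points for A -> B C over a string of n symbols: one between each adjacent pair."""
    return n - 1


def region_splits(n):
    """Labelings of n pixels with B or C in which both subregions are non-empty."""
    return sum(1 for x in product("BC", repeat=n) if "B" in x and "C" in x)


# Enumeration agrees with the closed form 2^n - 2 for small regions.
counts = {n: (string_splits(n), region_splits(n)) for n in range(1, 11)}
```

Under this assumption a 10-symbol string has 9 split points, whereas a 10-pixel region already has 2^10 - 2 = 1022 candidate splits.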

image parsing, representation, submodular field grammar

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.43)

Reviews: Submodular Field Grammars: Representation, Inference, and Application to Image Parsing

Neural Information Processing Systems, Oct-8-2024, 03:21:43 GMT

The key problem is that splitting the image into *arbitrarily-shaped* pixel regions to associate with the production rules is computationally difficult in general. This paper proposes to associate formal grammar production rules with submodular Markov random fields (MRFs). The submodular structure of the associated MRF allows fast inference of how a single rule splits a region into arbitrarily-shaped subregions, and a dynamic-programming-like algorithm for parsing the entire image structure. The experimental results show that the method is indeed much faster than previous methods. Pros: 1) Well-written and easy to read even though some of the details are fairly technical.

grammar, representation, submodular field grammar

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)

Image Parsing with Stochastic Scene Grammar

Neural Information Processing Systems, Apr-6-2023, 12:52:01 GMT

In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative "+" relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive "-" relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations.
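The three production rule types can be pictured as a small data structure. The sketch below is a hypothetical illustration, not the paper's implementation: the `Rule` class, the `max_count` cap on SET rules, and the toy scene grammar are all assumptions, and it only samples top-down from the grammar. It does not model the contextual relations or the MCMC inference described in the abstract.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Rule:
    kind: str                      # "AND", "OR", "SET", or "TERMINAL"
    name: str
    children: list = field(default_factory=list)
    max_count: int = 3             # hypothetical cap on copies, used by SET rules only


def sample(rule, rng):
    """Expand a rule top-down into a nested (name, parts) parse tree."""
    if rule.kind == "TERMINAL":
        return (rule.name, [])
    if rule.kind == "AND":         # decompose into all sub-parts
        return (rule.name, [sample(c, rng) for c in rule.children])
    if rule.kind == "OR":          # switch to exactly one sub-type
        return (rule.name, [sample(rng.choice(rule.children), rng)])
    if rule.kind == "SET":         # ensemble: a variable number of copies of one entity
        k = rng.randint(1, rule.max_count)
        return (rule.name, [sample(rule.children[0], rng) for _ in range(k)])
    raise ValueError(f"unknown rule kind: {rule.kind}")


# Hypothetical toy grammar loosely following the scene / face / line hierarchy.
line = Rule("TERMINAL", "line")
face = Rule("AND", "face", [line, line, line, line])
foreground = Rule("SET", "foreground", [face], max_count=3)
background = Rule("OR", "background",
                  [Rule("TERMINAL", "indoor"), Rule("TERMINAL", "outdoor")])
scene = Rule("AND", "scene", [foreground, background])

tree = sample(scene, random.Random(0))
```

Each sampled tree has both sub-parts of the scene (AND), between one and three faces in the foreground (SET), and exactly one background sub-type (OR).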

algorithm, image parsing, stochastic scene grammar

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.85)
Information Technology > Artificial Intelligence > Machine Learning (0.58)

Image Parsing with Stochastic Scene Grammar

Zhao, Yibiao; Zhu, Song-chun

Neural Information Processing Systems, Feb-14-2020, 21:27:18 GMT

In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative "+" relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive "-" relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations.

algorithm, image parsing, stochastic scene grammar

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.85)
Information Technology > Artificial Intelligence > Machine Learning (0.62)

Submodular Field Grammars: Representation, Inference, and Application to Image Parsing

Friesen, Abram L.; Domingos, Pedro M.

Neural Information Processing Systems, Feb-14-2020, 14:27:47 GMT

Natural scenes contain many layers of part-subpart structure, and distributions over them are thus naturally represented by stochastic image grammars, with one production per decomposition of a part. Unfortunately, in contrast to language grammars, where the number of possible split points for a production $A \rightarrow BC$ is linear in the length of $A$, in an image there are an exponential number of ways to split a region into subregions. This makes parsing intractable and requires image grammars to be severely restricted in practice, for example by allowing only rectangular regions. In this paper, we address this problem by associating with each production a submodular Markov random field whose labels are the subparts and whose labeling segments the current object into these subparts. We call the result a submodular field grammar (SFG).

image parsing, representation, submodular field grammar

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.48)
